Key concepts and definitions:
Theory, hypotheses, operationalization, and measurement

PSCI 2270 - Lecture 2

Georgiy Syunyaev

Department of Political Science, Vanderbilt University

September 5, 2023

Plan for this week


  1. Causal Theories

  2. From Theory to Hypothesis

  3. Operationalization of Theory

  4. From Population to Sample

Plan for this week

  1. Causal Theories

Three Types of Empirical Questions



  • Predictive: Forecast future events from data

  • Descriptive: Summarize data, investigate facts, discover hidden patterns

  • Causal: Answer what-if’s

What is Correlation? Causation? Confounder?


  • Correlation: any statistical association between variables, though the term commonly refers to the degree to which a pair of variables is linearly related.

  • Causation: one event is the result of the occurrence of another; that is, there is a causal relationship between the two events. Also referred to as cause and effect.

  • Confounder: (also confounding variable, omitted variable, or lurking variable) a variable that influences both the independent and the dependent variable, which can produce a spurious association between them.
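
To make the "linearly related" part of the correlation definition concrete, here is a minimal Python sketch. The helper `pearson` and the toy data are illustrative assumptions, not part of the lecture. It compares an exactly linear relationship with an exactly quadratic one on the same values of X.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy data (hypothetical): X is symmetric around zero.
x = [-2, -1, 0, 1, 2]
linear = [2 * v + 1 for v in x]  # Y is an exact linear function of X
curved = [v ** 2 for v in x]     # Y is an exact, but non-linear, function of X

r_linear = pearson(x, linear)
r_curved = pearson(x, curved)
print(r_linear, r_curved)
```

In the quadratic case Y is fully determined by X, yet the linear correlation is zero, so a correlation near zero does not by itself rule out a relationship.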

Why is this funny?


Examples of correlation: Positive



Examples of correlation: None



What could correlation mean?


  • Suppose there are two factors that we know are positively correlated with each other (i.e. when \(X\) is higher, \(Y\) tends to be higher too).

  • \(X\) usually denotes the independent (explanatory) variable; \(Y\) denotes the dependent (outcome) variable.

Direct causation


  • Example: Coffee reduces the chance of cardiovascular disease.

Confounding


  • Example: As ice cream sales increase, the rate of drowning deaths increases sharply. Therefore, ice cream consumption causes deaths. (The inference is flawed: hot weather raises both ice cream sales and swimming, and with it drownings.)
  • IMPORTANT! With a confounder present, the correlation may understate or overstate the extent of causation, but there may still be a causal relationship between \(X\) and \(Y\).
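
A small simulation can illustrate how a confounder produces a correlation without any direct effect. The sketch below rests on stated assumptions: temperature is taken as the (hypothetical) common cause, all variables are standardized noise, and neither ice cream sales nor drownings affects the other. The function names and the seed are illustrative.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def residuals(ys, xs):
    """Residuals from a least-squares line of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]


rng = random.Random(2270)  # fixed seed so the run is reproducible
n = 5000

# Assumed data-generating process: temperature drives both outcomes,
# and there is NO arrow between ice cream sales and drownings.
temperature = [rng.gauss(0, 1) for _ in range(n)]
ice_cream = [t + rng.gauss(0, 1) for t in temperature]
drownings = [t + rng.gauss(0, 1) for t in temperature]

raw = pearson(ice_cream, drownings)
adjusted = pearson(
    residuals(ice_cream, temperature), residuals(drownings, temperature)
)
print(f"raw correlation: {raw:.2f}, after adjusting for temperature: {adjusted:.2f}")
```

Under these assumptions the raw correlation is clearly positive, while the correlation between the parts of each variable not explained by temperature is close to zero. Adjusting this way only works when the confounder is observed and enters linearly.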

Reverse causation


  • Example: The faster windmills are observed to rotate, the more wind we observe. Therefore wind is caused by the rotation of windmills.

Bidirectional causation


  • Example: Predator numbers affect prey numbers, but prey numbers, i.e. food supply, also affect predator numbers.
  • IMPORTANT! Most bidirectional causal relationships can be represented as a recursive relationship, i.e. \(X_{t} \rightarrow Y_{t} \rightarrow X_{t+1}\), where \(t\) indexes time periods.
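
One way to see why the recursive representation helps is to write out the time-indexed arrows explicitly. The following sketch is illustrative: `unroll` and `position` are hypothetical helper names, and X and Y stand for, say, predator and prey numbers. It builds the arrows for a few periods and checks that every arrow points forward in a fixed ordering, so no feedback loop remains.

```python
def unroll(periods):
    """List the arrows X_t -> Y_t and Y_t -> X_{t+1} for the given number of periods."""
    edges = []
    for t in range(periods):
        edges.append((("X", t), ("Y", t)))
        if t + 1 < periods:
            edges.append((("Y", t), ("X", t + 1)))
    return edges


def position(node):
    """Place nodes on a line: X_0, Y_0, X_1, Y_1, ..."""
    name, t = node
    return 2 * t + (0 if name == "X" else 1)


edges = unroll(3)
# If every arrow runs from an earlier position to a later one, there can be no cycle.
forward_only = all(position(cause) < position(effect) for cause, effect in edges)

for cause, effect in edges:
    print(f"{cause[0]}_{cause[1]} -> {effect[0]}_{effect[1]}")
print("all arrows point forward:", forward_only)
```

Without the time index the same relationship would read X -> Y -> X, which is a cycle.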

Spurious correlation (by chance)


  • Example: Per capita consumption of chicken in the US is highly correlated with total US crude oil imports from 2000 to 2009.
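
Chance correlations of this kind become easy to find when many unrelated series are compared over a short period. The sketch below is a simulation under stated assumptions, not a reanalysis of the chicken and oil data: one random ten-point "target" series is compared against 1,000 independent random series, and the strongest correlation is reported.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(2009)  # fixed seed so the run is reproducible
years = 10                 # a short annual series, e.g. one decade

# All series are independent noise: by construction nothing causes anything.
target = [rng.gauss(0, 1) for _ in range(years)]
candidates = [[rng.gauss(0, 1) for _ in range(years)] for _ in range(1000)]

best = max(abs(pearson(target, series)) for series in candidates)
print(f"strongest correlation found among unrelated series: {best:.2f}")
```

With only ten observations per series, searching over many candidates will almost always turn up a strong correlation, even though none of the series is related to the target.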

Is it causal?


Claim: Internet searches for the word “flu” increase the incidence of flu in the city from which the search arose.

Evidence: The more people in a city who do a Google search for the word “flu”, the more cases of flu there tend to be in that city.

Claim: Smoking causes lung cancer.

Evidence: People who smoke are more likely to contract lung cancer than people who don’t smoke.

Claim: Cell phone access increases violent protest.

Evidence: When a region in Africa gets cell phone coverage the frequency of violent political protests in the next year goes up.

Is it causal?


Claim: Experience of civil war causes people to develop more violent personalities.

Evidence: The more years of civil war a country has experienced since 1945, the more yellow and red cards its nationals get in club and international soccer matches.

Claim: Giving a student high grades causes them to perform better on standardized tests.

Evidence: Teenagers with higher high school GPAs get better scores on their SATs.

Claim: Super Bowl appearances are bad for health in the team’s home town.

Evidence: If you live in a town whose team makes it to the Super Bowl you are more likely to die from the flu in that year.

Caution: Big Data and Prediction



  • Google Flu Trends: Predicting flu epidemics and other public health issues from Google searches for flu-related information.
  • Moneyball: Based on a book about how the Oakland Athletics baseball team used analytics and evidence-based data to assemble a competitive team. The team abandoned old predictors of success, such as runs batted in, for overlooked ones, like on-base percentage.
  • Simulated humans: Simulate individual survey responses using ChatGPT to approximate real-life responses to anything (!) at no cost.

Plan for this week

  2. From Theory to Hypothesis

Directed Acyclic Graphs (DAGs)

  • A DAG displays assumptions about the relationship between variables (nodes).

    • The assumptions we make take the form of lines (edges) going from one node to another.
    • Edges are directed, which is to say that each has a single (!) arrowhead indicating the direction of the effect.
  • DAGs explain causality in terms of counterfactuals. A causal effect is defined as a comparison between two states of the world.
  • In DAG notation, causality runs in one direction. Specifically, it runs forward in time. There are no cycles in a DAG.
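
The "no cycles" requirement can be checked mechanically. The sketch below uses Python's standard `graphlib` module (Python 3.9+). The node names X, Y, Z and the helper `causal_order` are illustrative assumptions; the graphs are generic examples, not the models from the papers discussed in this lecture.

```python
from graphlib import CycleError, TopologicalSorter


def causal_order(parents):
    """Return a cause-before-effect ordering, or None if the graph has a cycle.

    `parents` maps each node to the set of its direct causes.
    """
    try:
        return list(TopologicalSorter(parents).static_order())
    except CycleError:
        return None


# Confounding: Z causes both X and Y, and X also causes Y.
confounded = {"Z": set(), "X": {"Z"}, "Y": {"X", "Z"}}

# Feedback loop: X causes Y and Y causes X, so this is not a DAG.
feedback = {"X": {"Y"}, "Y": {"X"}}

order = causal_order(confounded)
cyclic = causal_order(feedback)
print("confounded graph order:", order)
print("feedback graph order:", cyclic)
```

A valid ordering exists only when the graph is acyclic, so the feedback graph yields `None`.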

Effects of Partisan Media

Page and Jones (1979)


Markus and Converse (1979)


Do NOT do THIS! 😵

Plan for this week

  3. Operationalization of Theory

How Good is Good Enough?

  • Concepts are latent:

    • We almost never observe “concepts”
    • Instead we rely on “indicators” or “proxies”
  • Indicators are concrete:

    • Concrete measure of a latent concept
    • Sometimes they’re “good,” sometimes they’re “rough”
  • Sometimes there is slippage between latent concept and proxy, e.g.

    • Responses to a specific policy question about affirmative action as a proxy for “racial resentment”
    • Outcomes measured via self-reports may be clouded by social desirability bias (e.g., self-reported voter turnout)
  • Important to make measurement as unobtrusive as possible

Validity vs. Reliability


  • Validity: measure what you intend to measure

  • Reliability: same answer over repeated measurements
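
The difference between the two properties can be sketched with a simulation. Everything here is a stated assumption for illustration: a latent quantity with true value 50, one instrument that is consistently off by 10 but very precise, and another that is right on average but noisy.

```python
import random
import statistics

rng = random.Random(2)  # fixed seed so the run is reproducible
true_value = 50.0       # hypothetical true level of the latent concept

# Reliable but not valid: readings cluster tightly around the wrong value.
biased_precise = [true_value + 10 + rng.gauss(0, 0.5) for _ in range(1000)]

# Valid on average but unreliable: centered on the truth with wide scatter.
unbiased_noisy = [true_value + rng.gauss(0, 10) for _ in range(1000)]


def summarize(readings):
    """Return (bias, spread): average error and standard deviation of readings."""
    bias = statistics.mean(readings) - true_value
    spread = statistics.stdev(readings)
    return bias, spread


bias_a, spread_a = summarize(biased_precise)
bias_b, spread_b = summarize(unbiased_noisy)
print(f"biased but precise: bias={bias_a:.2f}, spread={spread_a:.2f}")
print(f"unbiased but noisy: bias={bias_b:.2f}, spread={spread_b:.2f}")
```

Here bias stands in for (lack of) validity and spread for (lack of) reliability. This simplifies both concepts, but it shows that an instrument can score well on one and poorly on the other.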

Aggregation: Women’s Equality in the Arab World

  • Measures of equality:

    • Be president/prime minister
    • Work outside the home
    • Men better leaders
    • University for boys not girls
    • Equal job opportunities
    • Equal wages
    • Travel abroad alone
  • Combine this information:

    • Create an additive index
    • What do the numbers mean?
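
A minimal sketch of an additive index follows. The coding scheme is an assumption made for illustration: each item is scored 1 for "agree" and 0 for "disagree", and the two items where agreement signals less support for equality are reverse-coded. The item keys and both respondents are hypothetical.

```python
# Item key -> True if agreement indicates LESS support for equality (reverse-coded).
ITEMS = {
    "president_pm": False,        # a woman can be president/prime minister
    "work_outside": False,        # women can work outside the home
    "men_better_leaders": True,   # men make better leaders
    "university_boys": True,      # university matters more for boys than girls
    "equal_jobs": False,          # equal job opportunities
    "equal_wages": False,         # equal wages
    "travel_alone": False,        # women can travel abroad alone
}


def additive_index(response):
    """Sum the items after reverse-coding, so higher means more support for equality."""
    total = 0
    for item, reverse in ITEMS.items():
        answer = response[item]  # 1 = agree, 0 = disagree
        total += (1 - answer) if reverse else answer
    return total


# Two hypothetical respondents with different answer patterns.
respondent_a = {"president_pm": 1, "work_outside": 1, "men_better_leaders": 1,
                "university_boys": 0, "equal_jobs": 1, "equal_wages": 1,
                "travel_alone": 0}
respondent_b = {"president_pm": 0, "work_outside": 1, "men_better_leaders": 0,
                "university_boys": 0, "equal_jobs": 1, "equal_wages": 1,
                "travel_alone": 0}

score_a = additive_index(respondent_a)
score_b = additive_index(respondent_b)
print(score_a, score_b)
```

Both respondents receive the same score despite disagreeing on two items, which is one reason to ask what the numbers mean: an additive index treats every item as equally important and interchangeable.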

Unit of Analysis: Measures of Wealth



  • At the individual level:

    • Income? From wages? From capital gains?
    • Assets?
    • Consumer products? Calories consumed?
  • At the country level:

    • GDP? GDP/capita?
    • Energy consumption?
    • Infant mortality rate?

References

Markus, Gregory B., and Philip E. Converse. 1979. “A Dynamic Simultaneous Equation Model of Electoral Choice.” The American Political Science Review 73 (4): 1055–70. https://doi.org/10.2307/1953989.
Page, Benjamin I., and Calvin C. Jones. 1979. “Reciprocal Effects of Policy Preferences, Party Loyalties and the Vote.” The American Political Science Review 73 (4): 1071–89. https://doi.org/10.2307/1953990.